28 research outputs found

    A cloud solution for multi-omics data integration


    The Genome Conformation As an Integrator of Multi-Omic Data: The Example of Damage Spreading in Cancer.

    Publicly available multi-omic databases, particularly those associated with medical annotations, are rich resources with the potential to drive a rapid transition from high-throughput molecular biology experiments to better clinical outcomes for patients. In this work, we propose a model for multi-omic data integration (i.e., genetic variations, gene expression, genome conformation, and epigenetic patterns) that exploits a multi-layer network approach to analyse, visualize, and obtain insights from such biological information, so that the results can be used at a macroscopic level. Using this representation, we can describe how driver and passenger mutations accumulate during the development of diseases, providing, for example, a tool able to characterize the evolution of cancer. Our test case concerns the MCF-7 breast cancer cell line, before and after stimulation with estrogen, since many datasets are available for this case study. In particular, the integration of data about cancer mutations, gene functional annotations, genome conformation, epigenetic patterns, gene expression, and metabolic pathways in our multi-layer representation will allow a better interpretation of the mechanisms behind a complex disease such as cancer. Thanks to this multi-layer approach, we focus on the interplay of chromatin conformation and cancer mutations in different pathways, such as metabolic processes, that are very important for tumor development. Working on this model, a variance analysis can be implemented to identify normal variations within each omics layer and to characterize, by contrast, variations that can be attributed to pathological samples rather than normal ones. This integrative model can be used to identify novel biomarkers and to provide innovative omic-based guidelines for treating many diseases, improving the efficacy of the decision trees currently used in the clinic.

    Memory-Optimised Parallel Processing of Hi-C Data

    This paper presents the optimisation efforts on the creation of a graph-based mapping representation of gene adjacency. The method is based on the Hi-C process, starting from Next Generation Sequencing data, and it analyses a huge amount of static data in order to produce maps for one or more genes. Straightforward parallelisation of this scheme does not yield acceptable performance on multicore architectures, since scalability is limited by the memory-bound nature of the problem. This work focuses on the memory optimisations that can be applied to the graph construction algorithm and its (complex) data structures to derive a cache-oblivious algorithm and, eventually, to improve memory bandwidth utilisation. As a running example we use NuChart-II, a tool for annotation and statistical analysis of Hi-C data that creates a gene-centric neighbourhood graph. The proposed approach, which is exemplified for Hi-C, addresses several common issues in the parallelisation of memory-bound algorithms for multicore. Results show that the proposed approach is able to increase the parallel speedup from 7x to 22x (on a 32-core platform). Finally, the proposed C++ implementation outperforms the first R NuChart prototype, with which it was not possible to complete the graph generation because of severe memory-saturation problems.
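
    As a rough illustration of what a gene-centric neighbourhood graph might look like, the Python sketch below expands a graph outward from a seed gene over a list of pairwise contacts. The input format, the gene names, and the function `neighbourhood_graph` are hypothetical, chosen only for this example; this is not NuChart-II's algorithm and it does not include any of the memory optimisations the abstract describes.

```python
from collections import defaultdict, deque


def neighbourhood_graph(contacts, seed, max_depth):
    """Breadth-first expansion from a seed gene over pairwise contacts.

    contacts: iterable of (gene_a, gene_b) pairs (hypothetical input format).
    Returns the set of undirected edges (as frozensets) reached by
    expanding genes that lie fewer than max_depth hops from the seed.
    """
    # Build an undirected adjacency map from the contact pairs.
    adjacency = defaultdict(set)
    for a, b in contacts:
        adjacency[a].add(b)
        adjacency[b].add(a)

    depth = {seed: 0}
    edges = set()
    queue = deque([seed])
    while queue:
        gene = queue.popleft()
        if depth[gene] == max_depth:
            continue  # do not expand beyond the requested depth
        for other in adjacency[gene]:
            edges.add(frozenset((gene, other)))
            if other not in depth:
                depth[other] = depth[gene] + 1
                queue.append(other)
    return edges


if __name__ == "__main__":
    # Made-up contacts, for illustration only.
    demo = [("A", "B"), ("B", "C"), ("C", "D")]
    print(neighbourhood_graph(demo, "A", 2))
```

    Even in this toy form, each expansion step touches the adjacency map at unpredictable locations, which hints at why graph construction of this kind tends to be limited by memory access rather than by computation.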

    L’ACQUISIZIONE DEL SISTEMA VOCALICO ITALIANO: UNO STUDIO ACUSTICO SU PARLANTI GERMANOFONI E ROMENOFONI

    The acquisition of the Italian vowel system: an instrumental study on German and Romanian learners. This paper investigates the acquisition of stressed vowels in L2 Italian by 6 German and 6 Romanian adult learners, who had lived in Italy for at least 6 months at the time they were recorded. To ensure comparability and support inferences, we compared them with 6 native Italian speakers in the same age range. The first result of this study indicates a notable effect of “equivalence classification” on the Romanian learners’ pronunciation of the Italian open-mid vowels. Fine-grained acoustic analyses reveal that their degree of vowel openness is often close-mid. This suggests that Romanian learners may not be able to easily perceive and produce the /e - ɛ/ and /o - ɔ/ contrasts, which have no phonemic value in their L1. German learners, however, experience no difficulty with open-mid vowels. Acoustically, the pronunciation of Italian [e] and [u] displays peculiar values for some German speakers, although these differences are not statistically significant. Statistical models and clustering techniques show that, in general, learners produce Italian vowels more accurately if they have lived in Italy for at least 48 months, if they mainly use Italian to communicate, if they have received specific pronunciation training in Italian, and if they have a high level of language skills in Italian. The age of onset does not seem to have any statistically significant effect on the correctness of vowel pronunciation. This might suggest that input quality and quantity play a more important role in vowel acquisition, which could have implications for teaching Italian to L2 learners.

    Parallel Stochastic Simulators in System Biology: The Evolution of the Species

    The stochastic simulation of biological systems is an increasingly popular technique in Bioinformatics. It is often an enlightening technique, especially for multi-stable systems whose dynamics can hardly be captured with ordinary differential equations. To be effective, stochastic simulations should be supported by powerful statistical analysis tools. The simulation-analysis workflow may, however, turn out to be computationally expensive, thus compromising the interactivity required in model tuning. In this work we advocate the high-level design of simulators for stochastic systems as a vehicle for building efficient and portable parallel simulators. In particular, the Calculus of Wrapped Components (CWC) simulator, which is designed according to FastFlow’s pattern-based approach, is presented and discussed. FastFlow has also been extended to support clusters of multi-cores with minimal coding effort, which demonstrates the portability of the approach. Keywords: parallel patterns; multi-core; distributed computing; stochastic simulation; systems biology.
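
    For readers unfamiliar with stochastic simulation, the sketch below shows one common technique, Gillespie's direct method, applied to a simple birth-death process in Python. The abstract does not say which algorithm the CWC simulator uses, so this is only an assumed, generic example; the function name `gillespie_birth_death` and all parameter values are hypothetical, and nothing here reflects the CWC or FastFlow APIs.

```python
import random


def gillespie_birth_death(x0, birth_rate, death_rate, t_end, seed=0):
    """Gillespie direct method for a birth-death process.

    Reactions (illustrative model, not taken from the paper):
      birth: 0 -> X   with propensity birth_rate
      death: X -> 0   with propensity death_rate * x
    Returns a list of (time, population) points.
    """
    rng = random.Random(seed)  # seeded for reproducible trajectories
    t, x = 0.0, x0
    trajectory = [(t, x)]
    while True:
        a_birth = birth_rate
        a_death = death_rate * x
        a_total = a_birth + a_death
        if a_total == 0:
            break  # no reaction can fire any more
        # The waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
        trajectory.append((t, x))
    return trajectory


if __name__ == "__main__":
    run = gillespie_birth_death(x0=5, birth_rate=1.0, death_rate=0.1, t_end=10.0)
    print(len(run), run[-1])
```

    Because each trajectory depends only on its own random stream, many runs with different seeds are independent of one another, which is one reason simulations of this kind are natural candidates for parallel execution followed by statistical analysis.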